Ambiguity in German connectives: A corpus study

Authors

  • Angela Schneider
  • Manfred Stede
Abstract

For deriving information on text structure (in the sense of coherence relations holding between neighbouring text spans), connectives are the most useful source of evidence on the text surface. However, many potential connectives also have a non-connective reading, and thus a disambiguation step is necessary. This problem has received only relatively little attention so far. We present the results of a corpus study on German connectives, designed to estimate the magnitude of the ambiguity problem and to prepare the development of disambiguation procedures.

Similar resources

Discovery of Ambiguous and Unambiguous Discourse Connectives via Annotation Projection

We present work on tagging German discourse connectives using English training data and a German-English parallel corpus, and report first results towards a more comprehensive approach of doing annotation projection for explicit discourse relations. Our results show that (i) an approach based on a dictionary of connectives currently has advantages over a simpler approach that uses word alignmen...

Connective-based Local Coherence Analysis: A Lexicon for Recognizing Causal Relationships

Local coherence analysis is the task of deriving the (most likely) coherence relation holding between two elementary discourse units or, recursively, larger spans of text. The primary source of information for this step is the connectives provided by a language for, more or less explicitly, signaling the relations. Focusing here on causal coherence relations, we propose a lexical resource that ...

Toward a Bilingual Lexical Database on Connectives: Exploiting a German/Italian Parallel Corpus

English. We report on experiments to validate and extend two language-specific connective databases (German and Italian) using a word-aligned corpus. This is a first step toward constructing a bilingual lexicon on connectives that are connected via their discourse senses. Italiano. Presentiamo una serie di esperimenti per validare ed estendere due database dei connettivi, che sono specifici per ...

Easily Identifiable Discourse Relations

We present a corpus study of local discourse relations based on the Penn Discourse Tree Bank, a large manually annotated corpus of explicitly or implicitly realized relations. We show that while there is a large degree of ambiguity in temporal explicit discourse connectives, overall connectives are mostly unambiguous and allow high-accuracy prediction of discourse relation type. We achieve 93.0...

The Potsdam Commentary Corpus

A corpus of German newspaper commentaries has been assembled and annotated with different information (and currently, to different degrees): part-of-speech, syntax, rhetorical structure, connectives, co-reference, and information structure. The paper explains the design decisions taken in the annotations, and describes a number of applications using this corpus with its multi-layer annotation.

Publication date: 2012